Kurtosis removal for data pre-processing
Authors
Abstract
Mesokurtic projections are linear projections with null fourth cumulants. They might be useful data pre-processing tools when nonnormality, as measured by the fourth cumulants, is either an opportunity or a challenge. Nonnull fourth cumulants are opportunities when projections with extreme kurtosis are used to identify interesting nonnormal features, for example clusters and outliers. Unfortunately, this approach suffers from the curse of dimensionality, which may be addressed by projecting the data onto the subspace orthogonal to mesokurtic projections. Nonnull fourth cumulants are challenges when using statistical methods whose sampling properties heavily depend on the fourth cumulants themselves. Mesokurtic projections ease the problem by allowing the use of the inferential properties of the same methods under normality. The paper shows necessary and sufficient conditions for the existence of mesokurtic projections and compares them with other gaussianization methods. Theoretical and empirical results suggest that mesokurtic transformations are particularly useful when sampling from finite normal mixtures. The practical use of mesokurtic projections is illustrated with the AIS and the RANDU datasets.
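As a rough illustration of the idea of a projection with null fourth cumulant, the sketch below searches for a direction in two-dimensional data whose projection has zero sample excess kurtosis. It is not the paper's procedure: the bisection over an angle, the synthetic uniform and Laplace columns, and the helper names `excess_kurtosis` and `mesokurtic_direction_2d` are all assumptions made for illustration only.

```python
import numpy as np


def excess_kurtosis(y):
    """Sample fourth standardized cumulant: mean(z**4) - 3."""
    z = (y - y.mean()) / y.std()
    return float(np.mean(z ** 4) - 3.0)


def mesokurtic_direction_2d(X, tol=1e-8, max_iter=200):
    """Hypothetical helper: bisect on the angle between the two coordinate
    axes until the projected data have (near) zero sample excess kurtosis.
    Requires the two columns of X to have excess kurtosis of opposite sign."""

    def direction(theta):
        return np.array([np.cos(theta), np.sin(theta)])

    def kurt_at(theta):
        return excess_kurtosis(X @ direction(theta))

    lo, hi = 0.0, np.pi / 2
    k_lo, k_hi = kurt_at(lo), kurt_at(hi)
    if k_lo * k_hi > 0:
        raise ValueError("columns must have excess kurtosis of opposite sign")

    mid = lo
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        k_mid = kurt_at(mid)
        if abs(k_mid) < tol:
            break
        # Keep the half-interval on which the sign change occurs.
        if k_lo * k_mid < 0:
            hi = mid
        else:
            lo, k_lo = mid, k_mid
    return direction(mid)


# Synthetic data (an assumption, not from the paper): one platykurtic
# column (uniform) and one leptokurtic column (Laplace).
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([rng.uniform(-1.0, 1.0, size=n), rng.laplace(size=n)])

c = mesokurtic_direction_2d(X)
print("direction:", c)
print("excess kurtosis of projection:", excess_kurtosis(X @ c))
```

The search relies only on continuity: the sample excess kurtosis changes sign between the two axes, so some intermediate direction must make it vanish. In higher dimensions this one-parameter search no longer applies as written.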
Similar resources
Analysis of Pre-processing and Post-processing Methods and Using Data Mining to Diagnose Heart Diseases
Today, a great deal of data is generated in the medical field. Acquiring useful knowledge from this raw data requires data processing and detection of meaningful patterns and this objective can be achieved through data mining. Using data mining to diagnose and prognose heart diseases has become one of the areas of interest for researchers in recent years. In this study, the literature on the ap...
Optimized Data Pre-Processing for Discrimination Prevention
Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the ...
Data Pre-processing for Database Marketing
To increase effectiveness in their marketing and Customer Relationship Manager activities, many organizations are adopting strategies of Database Marketing (DBM). Nowadays, DBM faces new challenges in business knowledge since current strategies are mainly approached by classical statistical inference, which may fail when complex, multi-dimensional and incomplete data is available. An alternativ...
Pre-processing of Data for Nonlinear Mapping
The sequential nonlinear mapping is suitable for sequential detection of states of dynamic systems (Montvilas, 1999a). In addition, it can indicate the undesirable states and even the damages of dynamic systems. The latter is complicated when the damage is caused by a small change in the respective parameter describing the state. In the paper the problem of nonlinear mapping to be sensitive for the...
Redundant Data Removal Technique for Efficient Big Data Search Processing
The ranch industry has grown bigger. In Australia, ranches have a very large number of livestock commodities: cattle, lambs, and muttons. To manage commodities on such a large scale, they need to install a sensor network with Hadoop MapReduce, since the sensor network generates a huge amount of data. The ranch is divided into several patterned regions and a lot of hubs are installed in there for r...
Journal
Journal title: Advances in Data Analysis and Classification
Year: 2022
ISSN: 1862-5355, 1862-5347
DOI: https://doi.org/10.1007/s11634-022-00498-3